Synthetic Dataset: Prefix Caching Controls #183

Merged
merged 6 commits into from
Aug 20, 2025

Conversation

sjmonson
Collaborator

@sjmonson sjmonson commented Jun 10, 2025

Summary

This PR adds controls for token prefix cache rates to the synthetic data generator. First, it adds an auto-incrementing single-token prefix to each prompt so the same prefix is never repeated. Second, it adds controls for sharing a fixed prefix between samples.

Details

1. Ensure every prompt is unique

When generating a prompt, the first token is now taken from an iterator over the tokenizer vocab.
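The idea can be sketched as follows. This is an illustrative stand-in, not guidellm's actual implementation: a small fake vocabulary replaces the real tokenizer vocab, and `make_prompt` is a hypothetical helper.

```python
from itertools import cycle

# Hypothetical sketch: the real generator iterates over the tokenizer's
# vocabulary; a small fake list of token ids stands in for it here.
fake_vocab_ids = list(range(32000))
unique_first_token = cycle(fake_vocab_ids)

def make_prompt(body_token_ids):
    """Prepend the next unused vocab token so no two prompts share a prefix."""
    return [next(unique_first_token)] + list(body_token_ids)

p1 = make_prompt([101, 102, 103])
p2 = make_prompt([101, 102, 103])
assert p1[0] != p2[0]  # identical bodies, but distinct first tokens
```

Because the first token differs between otherwise identical prompts, a server's prefix cache can never match across samples unless a shared prefix is explicitly configured.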

2. Add configurable prefix to simulate system prompt or other common token prefixes

Example usage:

data:
  prefix_tokens: 2048
  prompt_tokens: 256
  output_tokens: 256
  samples: 1024
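The prefix is additive to the prompt-length distribution, so each request in the scenario above totals `prefix_tokens + prompt_tokens_sample()`. A minimal sketch of that arithmetic, assuming a Gaussian prompt-length sampler (the distribution and `stddev` value are illustrative, not guidellm's defaults):

```python
import random

# Names mirror the YAML scenario above; the sampler is an assumption.
prefix_tokens = 2048
prompt_tokens = 256

def request_length(stddev=16):
    """Each request is the fixed shared prefix plus a sampled prompt length."""
    sampled = max(1, int(random.gauss(prompt_tokens, stddev)))
    return prefix_tokens + sampled

length = request_length()
assert length > prefix_tokens  # the prefix never replaces prompt tokens
```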

Test Plan

  • PR includes unit tests for all synthetic dataset changes (pytest tests/unit/dataset)
  • The scenario in the Details section can be run against a model server with prefix caching enabled; the cache hit rate can be confirmed by inspecting the server's console output.

Related Issues


  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes AI-assisted code completion
  • Includes code generated by an AI application
  • Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

@sjmonson sjmonson requested review from markurtz and Copilot June 10, 2025 19:39

@sjmonson sjmonson requested a review from Copilot June 10, 2025 19:41

@dagrayvid dagrayvid changed the title Sythetic Dataset: Support setting a shared prompt prefix. Synthetic Dataset: Support setting a shared prompt prefix. Jun 11, 2025
@markurtz markurtz added this to the v0.3.0 milestone Aug 13, 2025
@sjmonson sjmonson force-pushed the feat/fixed_prefix branch 2 times, most recently from 9c12d0b to 94a4508 Compare August 14, 2025 19:50
@sjmonson sjmonson force-pushed the feat/fixed_prefix branch 3 times, most recently from 20b660b to 081cdab Compare August 19, 2025 19:49
sjmonson and others added 5 commits August 19, 2025 15:50
Signed-off-by: Samuel Monson <smonson@redhat.com>
Signed-off-by: Samuel Monson <smonson@redhat.com>
Signed-off-by: Samuel Monson <smonson@redhat.com>
Co-authored-by: Mehul <MEHTMEHUL@GMAIL.COM>
Co-authored-by: Samuel Monson <smonson@redhat.com>
Signed-off-by: Samuel Monson <smonson@redhat.com>
Signed-off-by: Samuel Monson <smonson@redhat.com>
@sjmonson sjmonson force-pushed the feat/fixed_prefix branch 2 times, most recently from 473d097 to 692589c Compare August 19, 2025 20:17
@sjmonson sjmonson changed the title Synthetic Dataset: Support setting a shared prompt prefix. Synthetic Dataset: Prefix Caching Controls Aug 19, 2025
@sjmonson sjmonson requested a review from Copilot August 19, 2025 21:29
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR implements prefix caching controls for the synthetic dataset generator by adding configurable prefix buckets and ensuring unique prompts. The implementation allows control of token prefix cache rates through shared prefixes across samples while maintaining prompt uniqueness.

Key changes include:

  • Added PrefixBucketConfig for configurable prefix generation with bucket weights, prefix counts, and token lengths
  • Modified prompt generation to include auto-incrementing unique prefixes and configurable shared prefixes
  • Updated documentation to reflect the new prefix configuration options
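The bucket-based design described above can be sketched as follows. The field and function names here are illustrative (`PrefixBucket`, `choose_prefix`); the PR's actual `PrefixBucketConfig` class is the authority on names and semantics.

```python
import random
from dataclasses import dataclass

@dataclass
class PrefixBucket:
    # Field names are assumptions modeled on the review's description.
    bucket_weight: int   # relative share of samples drawn from this bucket
    prefix_count: int    # number of distinct fixed prefixes in the bucket
    prefix_tokens: int   # length of each prefix, in tokens

def choose_prefix(buckets, rng=random):
    """Pick a bucket by weight, then one of its fixed prefixes at random."""
    weights = [b.bucket_weight for b in buckets]
    bucket = rng.choices(buckets, weights=weights, k=1)[0]
    prefix_id = rng.randrange(bucket.prefix_count)
    return bucket, prefix_id

buckets = [
    PrefixBucket(bucket_weight=3, prefix_count=2, prefix_tokens=2048),
    PrefixBucket(bucket_weight=1, prefix_count=8, prefix_tokens=256),
]
bucket, prefix_id = choose_prefix(buckets)
assert 0 <= prefix_id < bucket.prefix_count
```

Weighting buckets rather than fixing a single prefix lets a benchmark mix long shared system prompts with many short, rarely reused prefixes, which is what makes the cache hit rate controllable.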

Reviewed Changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.

File Description
src/guidellm/dataset/synthetic.py Core implementation of prefix bucket configuration and modified prompt generation logic
src/guidellm/dataset/__init__.py Exported new PrefixBucketConfig class
tests/unit/dataset/test_synthetic.py Comprehensive test suite for new prefix functionality
docs/datasets.md Updated documentation with prefix_tokens parameter
Comments suppressed due to low confidence (1)

docs/datasets.md:79

  • The documentation describes a prefix_tokens parameter, but the implementation uses prefix_buckets with a more complex structure. This documentation appears to be outdated or incorrect for the current implementation.
- `prefix_tokens`: Number of tokens to share as a prefix across all prompts. Is additive to the prompt tokens distribution so each request is `prefix_tokens + prompt_tokens_sample()`. If unset, defaults to 0.

@sjmonson sjmonson merged commit ec328c1 into vllm-project:main Aug 20, 2025
18 checks passed
@sjmonson sjmonson deleted the feat/fixed_prefix branch August 20, 2025 17:56
Development

Successfully merging this pull request may close these issues.

  • Ensure prefixes cannot be cached for synthetic datasets
  • [Feature Request] Testing with defined prefix lengths
3 participants